Parallel Retrieval of Dense Vectors in the Vector Space Model
Authors
Abstract
Modern information retrieval systems rely on distributed and parallel algorithms to meet their operational requirements, and they commonly operate on sparse vectors. Dimensionality-reduction techniques, however, produce dense and relatively short feature vectors. Motivated by the relevance of such dense vectors, we have parallelized the vector space model for dense matrices and vectors. Our algorithm uses a hybrid partitioning that splits both documents and features, and it operates on a mesh of hosts holding a block-partitioned corpus matrix. We show that the theoretical speed-up is optimal. The empirical evaluation of an MPI-based implementation reveals a super-linear speed-up on a cluster of Nehalem Xeon CPUs. A version of this report has been published as “Tobias Berka and Marian Vajteršic: Parallel Retrieval of Dense Vectors in the Vector Space Model. Computing and Informatics (CAI), 2, 2011.”
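The hybrid partitioning described in the abstract can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions, not the authors' MPI implementation: the function names `split_ranges` and `mesh_retrieve` are hypothetical, relevance is assumed to be a plain inner product, and the loops stand in for hosts of a mesh that would run concurrently. Each mesh row holds one block of documents, each mesh column one block of features, and partial scores are summed along a mesh row before the per-row top-k lists are merged.

```python
import heapq


def split_ranges(n, parts):
    """Split range(n) into `parts` contiguous, nearly equal (start, stop) ranges."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for p in range(parts):
        stop = start + base + (1 if p < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def mesh_retrieve(corpus, query, mesh_rows, mesh_cols, k):
    """Simulate top-k inner-product retrieval on a mesh_rows x mesh_cols mesh.

    corpus: list of dense document vectors (one list of floats per document).
    query:  dense query vector with the same number of features.
    Returns a list of (document_id, score), best first; ties go to the lower id.
    """
    doc_ranges = split_ranges(len(corpus), mesh_rows)   # document partition
    feat_ranges = split_ranges(len(query), mesh_cols)   # feature partition
    candidates = []
    for d0, d1 in doc_ranges:            # one mesh row per document block
        scores = [0.0] * (d1 - d0)
        for f0, f1 in feat_ranges:       # one host per feature block
            # Work of a single host: partial dot products on its local block.
            partial = [
                sum(corpus[d][f] * query[f] for f in range(f0, f1))
                for d in range(d0, d1)
            ]
            # Stands in for a sum-reduction along the mesh row.
            scores = [s + p for s, p in zip(scores, partial)]
        # Local top-k of this document block, keyed by global document id.
        candidates.extend(
            heapq.nlargest(k, ((s, -(d0 + i)) for i, s in enumerate(scores)))
        )
    # Stands in for merging the local top-k lists across mesh rows.
    best = heapq.nlargest(k, candidates)
    return [(-neg_id, score) for score, neg_id in best]
```

Because every document's score is a sum over feature blocks, the ranking does not depend on the mesh shape (up to floating-point rounding), so a call such as `mesh_retrieve(corpus, query, 2, 2, 3)` should return the same documents as a 1 x 1 mesh.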
Similar resources
Second dual space of little $\alpha$-Lipschitz vector-valued operator algebras
Let $(X,d)$ be an infinite compact metric space, let $(B,\|\cdot\|)$ be a unital Banach space, and take $\alpha \in (0,1)$. In this work, we first define the big and little $\alpha$-Lipschitz vector-valued ($B$-valued) operator algebras, and consider the little $\alpha$-Lipschitz $B$-valued operator algebra, $\mathrm{lip}_{\alpha}(X,B)$. Then we characterize its second dual space.
Vector Space semi-Cayley Graphs
The original aim of this paper is to construct a graph associated with a vector space. Inspired by the classical definition of the Cayley graph of a group, we define the Cayley graph of a vector space. The vector space Cayley graph ${\rm Cay}(\mathcal{V},S)$ is a graph whose vertex set consists of all vectors of the vector space $\mathcal{V}$, and two vectors $v_1,v_2$ are joined by an edge whenever...
Hopf Hypersurfaces of Sasakian Space Form with Parallel Ricci Operator
Esmaiel Abedi, Mohammad Ilmakchi, Department of Mathematics, Azarbaijan Shahid Madani University, Tabriz, Iran
Let $M^{2n}$ be a Hopf hypersurface with parallel Ricci operator, tangent to the structure vector field, in a Sasakian space form. First, we present the structures and properties of hypersurfaces and Hopf hypersurfaces in a Sasakian space form. Then we study the structure of hypersurfaces and Hopf hypersurfaces with a parallel Ricci tensor and show that there are two cases. In the first case, th...
Dimensions of Meaning
The representation of documents and queries as vectors in a high-dimensional space is well established in information retrieval [1]. This paper proposes to represent the semantics of words and contexts in a text as vectors. The dimensions of the space are words, and the initial vectors are determined by the words occurring close to the entity to be represented, which implies that the space has sev...
Analysis of Vector Space Model in Information Retrieval
Information retrieval is a key technology behind web search services. In information retrieval, it is common to model index terms and documents as vectors in a suitably defined vector space. The vector space model is one of the classical and widely applied retrieval models for evaluating the relevance of web pages. The retrieval operation consists of computing the cosine similarity function between a g...
Journal title:
- Computing and Informatics
Volume 30, issue
Pages -
Publication date 2011